
    A Combinatorial Optimization Approach to the Selection of Statistical Units

    In some large statistical surveys, the set of units that will constitute the scope of the survey must first be selected. We focus on the real case of a Census of Agriculture, where the units are farms. Surveying each unit has a cost and contributes a different share of the total information. One therefore wants to determine a subset of units that minimizes the total survey cost while representing at least a given portion of the total information. Uncertainty also arises, because the portion of information corresponding to each unit is not perfectly known before the unit is surveyed. The proposed approach is based on combinatorial optimization, and the resulting decision problems are modeled as multidimensional binary knapsack problems. Experimental results show the effectiveness of the proposed approach.
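    As a hedged illustration of how such a selection can be modeled, the sketch below casts a tiny instance as a multidimensional covering/knapsack-type problem: every unit has a survey cost and a share of the total information along each dimension, and a minimum-cost subset must reach a target share on every dimension. The numbers and the brute-force search are invented placeholders, not the paper's data or algorithm.

```python
# Minimal sketch: unit selection as a multidimensional binary knapsack-type
# problem. Costs, information shares and the target are invented for
# illustration; a real instance would be solved with an ILP solver, not by
# exhaustive enumeration.
from itertools import product

costs = [4.0, 2.5, 3.0, 1.5, 2.0]       # cost of surveying each unit
info = [                                # information share of each unit, per dimension
    [0.30, 0.10, 0.25, 0.20, 0.15],     # dimension 1 (shares sum to 1.0)
    [0.05, 0.40, 0.20, 0.10, 0.25],     # dimension 2 (shares sum to 1.0)
]
target = 0.70                           # required share of the total information

best_cost, best_sel = float("inf"), None
for sel in product([0, 1], repeat=len(costs)):   # toy-size exhaustive search
    covered = all(sum(s * w for s, w in zip(sel, dim)) >= target for dim in info)
    cost = sum(s * c for s, c in zip(sel, costs))
    if covered and cost < best_cost:
        best_cost, best_sel = cost, sel

print(best_sel, best_cost)              # cheapest subset meeting both targets
```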

    Identifying e-Commerce in Enterprises by means of Text Mining and Classification Algorithms

    Monitoring specific features of enterprises, such as the adoption of e-commerce, is a basic and important task for several economic activities. This type of information is usually obtained by means of surveys, which are costly because of the amount of personnel involved. Automatic detection of this information would allow considerable savings and is feasible, since the information is in general publicly available online through corporate websites. This work describes how to cast the detection of e-commerce as a supervised classification problem, where each record is obtained from the automatic analysis of one corporate website and the class is the presence or absence of e-commerce facilities. The automatic generation of such data records requires several text mining phases; in particular, we compare six strategies based on the selection of the best words and the best n-grams. We then classify the resulting dataset with four classification algorithms: Support Vector Machines, Random Forest, Statistical and Logical Analysis of Data, and a Logistic Classifier. This turns out to be a difficult classification problem; however, after careful design and set-up of the whole procedure, the results on a practical case of Italian enterprises are encouraging.
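    A hedged sketch of the general idea follows: website text is turned into word/n-gram features and fed to one of the classifiers listed above (here an SVM via scikit-learn). The tiny corpus, labels and parameter choices are hypothetical placeholders, not the paper's actual pipeline or data.

```python
# Minimal sketch: detecting e-commerce as supervised text classification.
# Corpus and labels are invented; a real run would use the text scraped
# from thousands of corporate websites.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

sites = [
    "add to cart checkout secure payment shipping",
    "company history mission contact opening hours",
    "buy online basket credit card delivery",
    "our products catalogue showroom phone",
]
has_ecommerce = [1, 0, 1, 0]            # 1 = e-commerce facilities present

# The "best words / best n-grams" selection is only approximated here by
# letting the vectorizer keep the highest-frequency unigrams and bigrams.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), max_features=1000),
    LinearSVC(),
)
model.fit(sites, has_ecommerce)
print(model.predict(["online shop checkout with credit card"]))
```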

    A min-cut approach to functional regionalization, with a case study of the Italian local labour market areas

    In several economic, statistical, and geographical applications, a territory must be subdivided into functional regions. Such regions are not fixed, politically delimited areas; they should be identified by analyzing the interactions among all the localities constituting the territory. This is a delicate and important task that often turns out to be computationally difficult. In this work we propose an innovative approach to this problem based on the solution of minimum-cut problems over an undirected graph, called here the transitions graph. The proposed procedure guarantees that the obtained regions satisfy all the statistical conditions required for this type of problem. Results on real-world instances show the effectiveness of the proposed approach.
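    As a hedged illustration of the core step, the sketch below splits a small weighted undirected "transitions graph" with a global minimum cut (Stoer-Wagner, via NetworkX). The graph and commuting flows are invented, and a real regionalization would apply such cuts within a larger procedure that also enforces the required statistical conditions.

```python
# Minimal sketch: one minimum-cut split of a transitions graph.
# Nodes are localities, edge weights are commuting flows (hypothetical).
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([
    ("A", "B", 120), ("B", "C", 90), ("A", "C", 80),   # tightly linked localities
    ("D", "E", 100), ("E", "F", 110), ("D", "F", 70),  # second tight group
    ("C", "D", 5),                                     # weak link between the groups
])

cut_value, (region1, region2) = nx.stoer_wagner(G)
print(cut_value)                         # commuting flow severed by the cut
print(sorted(region1), sorted(region2))  # the two candidate functional regions
```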

    Logical analysis of data as a tool for the analysis of probabilistic discrete choice behavior

    Probabilistic Discrete Choice Models (PDCM) have been extensively used to interpret the behavior of heterogeneous decision makers who face discrete alternatives. The classification approach of Logical Analysis of Data (LAD) uses discrete optimization to generate patterns, which are logic formulas characterizing the different classes. Patterns can be seen as rules explaining the phenomenon under analysis. In this work we discuss how LAD can be used as the first phase of the specification of PDCM. Since the number of patterns generated in this task may be extremely large, and many of them may be nearly equivalent, additional processing is necessary to obtain practically meaningful information. Hence, we propose computationally viable techniques to obtain small sets of patterns that constitute meaningful representations of the phenomenon and allow us to discover significant associations between subsets of explanatory variables and the output. We consider the complex socio-economic problem of the analysis of Internet use in Italy, using real data gathered by the Italian National Institute of Statistics.
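    To make the pattern idea concrete, here is a hedged toy sketch (not the paper's algorithm): low-degree conjunctive patterns over binarized explanatory variables are enumerated, and only those that are homogeneous for one class and cover enough observations are kept, mimicking the filtering of a huge pattern set down to a small, meaningful one. The data are invented.

```python
# Toy sketch of LAD-style pattern generation and filtering.
# Rows of X are decision makers described by binarized variables,
# y is the observed choice; all values are hypothetical.
from itertools import combinations

X = [
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 1, 1],
]
y = [1, 1, 0, 0, 1]

def covers(pattern, row):
    """pattern is a collection of literals (variable index, required value)."""
    return all(row[i] == v for i, v in pattern)

literals = [(i, v) for i in range(len(X[0])) for v in (0, 1)]
kept = []
for degree in (1, 2):
    for pat in combinations(literals, degree):
        pos = sum(covers(pat, r) for r, c in zip(X, y) if c == 1)
        neg = sum(covers(pat, r) for r, c in zip(X, y) if c == 0)
        if pos >= 2 and neg == 0:        # homogeneous + minimum coverage
            kept.append((pat, pos))

for pat, pos in kept:
    print(pat, "covers", pos, "positive observations")
```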

    Exploring the Potentialities of Automatic Extraction of University Webometric Information

    The main objective of this work is to show the potential of recently developed approaches for automatic knowledge extraction directly from university websites. The automatically extracted information can be updated more often than once per year and is safe from manipulation or misinterpretation. Moreover, this approach provides flexibility in collecting indicators of the efficiency of university websites and of their effectiveness in disseminating key content. These new indicators can complement traditional indicators of scientific research (e.g. number of articles and number of citations) and teaching (e.g. number of students and graduates) by introducing further dimensions that allow new insights for “profiling” the analyzed universities. The main findings of this study concern the evaluation of the digitalization potential of universities, in particular through techniques for the automatic extraction of information from the web to build indicators of the quality and impact of university websites. These indicators can complement traditional ones and, combined with clustering techniques, can be used to identify groups of universities with common features.
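    As a hedged sketch of the clustering step mentioned above, the code below standardizes a few invented webometric indicators and groups universities with k-means. The indicator names, values and number of clusters are assumptions for illustration only.

```python
# Minimal sketch: clustering universities on website-derived indicators.
# Columns (assumed): number of pages, share of recently updated pages,
# outbound links per page. All values are invented.
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

indicators = [
    [12000, 0.62, 3.1],
    [ 3500, 0.35, 1.4],
    [15000, 0.70, 2.8],
    [ 4000, 0.30, 1.2],
]

X = StandardScaler().fit_transform(indicators)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)                           # cluster membership of each university
```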

    XIPE: the X-ray Imaging Polarimetry Explorer

    X-ray polarimetry, sometimes alone and sometimes coupled to spectral and temporal variability measurements and to imaging, allows a wealth of physical phenomena in astrophysics to be studied. For example, it probes acceleration processes, including those typical of magnetic reconnection in solar flares, as well as emission in the strong magnetic fields of neutron stars and white dwarfs. It detects scattering in asymmetric structures such as accretion disks and columns, and in the so-called molecular torus and ionization cones. In addition, it allows fundamental physics to be probed in regimes of gravity and of magnetic field intensity not accessible to experiments on Earth. Finally, models that describe fundamental interactions (e.g. quantum gravity and extensions of the Standard Model) can be tested. We describe in this paper the X-ray Imaging Polarimetry Explorer (XIPE), proposed in June 2012 to the first ESA call for a small mission with a launch in 2017, but not selected. XIPE is composed of two of the three existing JET-X telescopes, with two Gas Pixel Detectors (GPD) filled with a He-DME mixture at their focus, and two additional GPDs filled with pressurized Ar-DME facing the Sun. The Minimum Detectable Polarization is 14% at 1 mCrab in 10^5 s (2-10 keV) and 0.6% for an X10 class flare. The Half Energy Width, measured at the PANTER X-ray test facility (MPE, Germany) with the JET-X optics, is 24 arcsec. XIPE takes advantage of a low-Earth equatorial orbit, with Malindi as down-link station and a Mission Operation Center (MOC) at INPE (Brazil). The paper is published in Experimental Astronomy (http://link.springer.com/journal/1068).
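    For context on the sensitivity figures quoted above, the sketch below evaluates the textbook definition of the Minimum Detectable Polarization at the 99% confidence level, MDP99 = 4.29 / (mu * R_s) * sqrt((R_s + R_b) / T). This formula and the example numbers are standard polarimetry background assumed here for illustration, not values taken from the XIPE paper.

```python
# Hedged sketch: standard MDP (99% confidence) estimate for a photoelectric
# polarimeter. mu is the modulation factor, rate_src and rate_bkg are the
# source and background count rates (counts/s), exposure_s the exposure time.
# The numbers below are placeholders, not XIPE instrument parameters.
from math import sqrt

def mdp99(mu, rate_src, rate_bkg, exposure_s):
    return 4.29 / (mu * rate_src) * sqrt((rate_src + rate_bkg) / exposure_s)

print(mdp99(mu=0.3, rate_src=0.1, rate_bkg=0.01, exposure_s=1e5))
```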